ABSTRACT

A new segmentation fusion method is proposed that ensembles
the output of several segmentation algorithms applied
to a remotely sensed image. The candidate segmentation
sets are processed to achieve a consensus segmentation using
a stochastic optimization algorithm based on the Filtered
Stochastic BOEM (Best One Element Move) method. For
this purpose, Filtered Stochastic BOEM is reformulated as
a segmentation fusion problem by designing a new distance
learning approach. The proposed algorithm also embeds the
computation of the optimum number of clusters into the
segmentation fusion problem.

Index Terms: Segmentation, clustering, fusion, consensus,
stochastic optimization.


I. INTRODUCTION

In hyperspectral remote sensing problems, it is difficult
to find an optimal segmentation algorithm that covers all the
spectral bands. Some objects are recognized on specific
spectral bands, whereas other objects may require the processing
of different bands together. For example, the algorithms with
a set of selected parameters may successfully detect objects
such as water and shadow in the near-infrared (NIR) band,
but may fail to detect objects which provide color or textural
information, such as farms and buildings. Therefore, one
may need to employ more than one segmentation output
obtained from multiple spectral bands to extract various types
of objects. Additionally, depending on the object types, one
may need to employ more than one set of features in the
segmentation algorithms.

In this study, we introduce a new approach for the
segmentation fusion problem based on a consensus clustering
algorithm, called Filtered Stochastic Best One Element Move
(Filtered Stochastic BOEM) [1]. The proposed method can
also be employed to find the optimal set of parameters of a
segmentation algorithm for a dataset. We first apply different
segmentation algorithms, or a single segmentation algorithm
with a set of different parameters, to a remote sensing
image and obtain a set of candidate outputs. Then, we design
a fusion strategy by adapting the Filtered Stochastic BOEM
method. There are two major contributions of the proposed
segmentation fusion method. The first is to formalize the
Filtered Stochastic BOEM method as a segmentation fusion
problem, where we design a new distance learning method.
The second contribution is to embed the computation of the
optimal cluster number into the Filtered Stochastic BOEM
method. In the suggested framework, we assume that some
of the segments in the candidate segmentation set are
expected to represent acquired target objects.

Three well-known segmentation algorithms, k-means,
Graph Cuts [2], [3], [4] and Mean Shift [5], [6], are used
as the base segmentation algorithms in order to segment
benchmark hyperspectral image datasets. In Section II,
we introduce our segmentation fusion method. We examine
the suggested method with various experiments in Section
III. Section IV concludes the paper.


II. FILTERED STOCHASTIC BOEM FORMULATED 
FOR THE FUSION OF SEGMENTATION
ALGORITHMS 

Filtered Stochastic BOEM [1] is a consensus clustering
algorithm which approximates a solution to the Median
Partition Problem [7] by integrating BOEM [7] and Stochastic
Gradient Descent (SGD) [8].

In the proposed segmentation fusion method, we first feed
an image $I$ to $D$ segmentation algorithms $SA_j$, $j = 1, \ldots, D$.
Each segmentation algorithm is employed on $I$ to obtain a
set of segmentation outputs $S_j = \{s_i\}_{i=1}^{k_j}$, where
$s_i \in A^N$ is a segmentation output, $N$ is the number of
pixels, $A$ is the set of segment labels with $|A| = C$ different
segment labels, and $d(\cdot, \cdot)$ is a distance function.

An initial segmentation $s$ is selected from the segmentation
set $S' = \bigcup_{j=1}^{D} S_j$, consisting of
$K = \sum_{j=1}^{D} k_j$ segmentations, using algorithms which
employ search heuristics, such as Best of K (BOK) [7]. Then, a
consensus segmentation $\hat{s}$ is computed by solving the
following optimization problem:

$$\hat{s} = \operatorname*{argmin}_{s} \sum_{i=1}^{K} d(s_i, s).$$

Given two segmentations $s_i$ and $s_j$, the distance function
is defined as the symmetric difference distance (SDD) given
by $d(s_i, s_j) = N_{01} + N_{10}$, where $N_{01}$ is the number of
pairs co-segmented in $s_i$ but not in $s_j$, and $N_{10}$ is the
number of pairs co-segmented in $s_j$ but not in $s_i$. In order to
compare results obtained with different numbers of pixels $N$ and
different numbers of segmentations $K$, we use a normalized form of
SDD which is called Average Sum of Distances (Average SOD):

$$\text{Average SOD}(s) = \frac{2}{K N (N-1)} \sum_{i=1}^{K} d(s_i, s).$$
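
The distance and the Best of K initialization described above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function names `sdd`, `average_sod` and `best_of_k` are hypothetical, segmentations are assumed to be flat label lists of equal length N, and the Average SOD is assumed to be normalized by K times the number of pixel pairs N(N-1)/2.

```python
from collections import Counter
from math import comb


def sdd(a, b):
    """Symmetric difference distance: pixel pairs co-segmented in exactly one of a, b."""
    together_a = sum(comb(n, 2) for n in Counter(a).values())
    together_b = sum(comb(n, 2) for n in Counter(b).values())
    together_both = sum(comb(n, 2) for n in Counter(zip(a, b)).values())
    return together_a + together_b - 2 * together_both


def average_sod(candidates, s):
    """Sum of SDDs from s to every candidate, divided by K * N(N-1)/2 (assumed normalization)."""
    return sum(sdd(c, s) for c in candidates) / (len(candidates) * comb(len(s), 2))


def best_of_k(candidates):
    """Best of K: the candidate with the smallest sum of distances to all candidates."""
    return min(candidates, key=lambda s: sum(sdd(c, s) for c in candidates))


candidates = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]
initial = best_of_k(candidates)
print(initial, average_sod(candidates, initial))
```

Counting co-segmented pairs through segment sizes avoids enumerating all N(N-1)/2 pairs explicitly, and makes the distance invariant to a relabeling of the segments.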

At each iteration of the optimization algorithm, a new
segmentation is computed. Specifically, a segmentation $s_k$ is
randomly selected from the segmentation set. Then, the best
one element move of the current segmentation $s$ is computed
with respect to the objective of the optimization and applied
to the current segmentation to generate a new segmentation.
If the best move yields no improvement, the current
segmentation is returned by the algorithm.

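Similar to the gradient descent method, the best one element
move of segmentation $s$ is defined as

$$\Delta s = -\frac{\partial}{\partial s} \sum_{i=1}^{K} d(s_i, s)$$

and can be evaluated by $\Delta s_t = -\frac{\partial H_t}{\partial s_t}$,
where $H_t = \sum_{i=1}^{K} d(s_i, s_t)$ is the objective at time $t$.
Using the assumption that single element updates do not change
the objective function significantly, $H_t$ can be approximated
by $H_{t-1}$ with a scale parameter $\beta \in (0, 1)$.
Then,

$$\Delta s_t = -\frac{\partial}{\partial s_t} \left( \beta H_{t-1} + d(s_k, s_t) \right),$$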

where $s_k$ is the randomly selected segmentation for updating
the current BOEM. If an $N \times C$ matrix $[H]$ is defined such
that the entry in the $i$-th row and the $j$-th column of the
matrix, $[H]_{ij}$, is the updated value of $H$ obtained by
switching the $i$-th element of $s$ to the $j$-th segment label,
the move can be approximated by

$$\operatorname*{argmin}_{i,j} \left( \beta [H_{t-1}]_{ij} + [d(s_k, s_t)]_{ij} \right), \qquad (2)$$

if the $k$-th segmentation is selected for updating $s_t$ at time $t$.
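
To make the move matrix in (2) concrete, the following sketch builds it by brute force and picks the minimizing entry. It is an illustration under stated assumptions rather than the authors' code: the names `move_matrix` and `best_move` are hypothetical, the distance is taken to be the SDD, labels are assumed to be integers in `range(num_labels)`, and every entry is recomputed from scratch, which is only practical for very small inputs.

```python
from collections import Counter
from math import comb


def sdd(a, b):
    """Symmetric difference distance between two label lists."""
    def pairs(labels):
        return sum(comb(n, 2) for n in Counter(labels).values())
    return pairs(a) + pairs(b) - 2 * pairs(zip(a, b))


def move_matrix(s_k, s, num_labels):
    """Entry (i, j): distance to s_k after switching element i of s to label j."""
    matrix = []
    for i in range(len(s)):
        row = []
        for j in range(num_labels):
            moved = list(s)
            moved[i] = j
            row.append(sdd(s_k, moved))
        matrix.append(row)
    return matrix


def best_move(h_prev, s_k, s, num_labels, beta):
    """Form H_t = beta * H_{t-1} + [d(s_k, s)] and return its argmin (i, j) with H_t."""
    d = move_matrix(s_k, s, num_labels)
    n = len(s)
    h = [[beta * h_prev[i][j] + d[i][j] for j in range(num_labels)] for i in range(n)]
    cells = ((i, j) for i in range(n) for j in range(num_labels))
    i, j = min(cells, key=lambda ij: h[ij[0]][ij[1]])
    return i, j, h


s_k = [0, 0, 1, 1]          # randomly selected candidate
s = [0, 1, 1, 1]            # current segmentation
h0 = [[0.0, 0.0] for _ in s]
i, j, h1 = best_move(h0, s_k, s, num_labels=2, beta=0.9)
print(i, j)                 # element i of s should be switched to label j
```

In this toy case the selected move relabels the second pixel so that the current segmentation coincides with the candidate.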

In the proposed Segmentation Fusion Algorithm (Algorithm 1), we
initialize $[H_t]$ at $t = 1$. Until $t$ reaches a given termination
time $T$, we update the segmentation $s$. We randomly select
a segmentation from a pseudo-random permutation of the
numbers $1, 2, \ldots, K$ until we traverse all the segmentations
in $s_1, s_2, \ldots, s_K$. Then, we generate a new segmentation and
repeat this operation until all of the permutations are
traversed. We update $[H_t]$ by aggregating $[d(s_k, s)]$ with the
scaled $\beta [H_t]$. The parameter $\beta \in [0, 1]$ controls the
convergence rate and the performance of the algorithm. If
$\beta = 0$, the algorithm becomes pure stochastic BOEM and is
memoryless. As $\beta$ approaches 1, the algorithm forgets past
segmentations slowly. However, Zheng, Kulkarni and Poor [1]
reported that the algorithm may perform worse if $\beta$ is at
either end of $[0, 1]$. Selection of the optimal $\beta$ values for
segmentation fusion is explained in Section II-B. After $[H_t]$ is
updated, we compute $\Delta s$ in order to update $s$. We iterate
the algorithm until the termination criterion is achieved.

input : Input image $I$, termination time $T$.
output: Output image $O$.

1  Run $SA_j$ on $I$ to obtain $S_j = \{s_i\}_{i=1}^{k_j}$, $\forall j = 1, \ldots, D$.
2  At $t = 1$, initialize $s$ and $[H_t]$.
   for $t \leftarrow 2$ to $T$ do
3      Randomly select one of the segmentation results $k \in \{1, 2, \ldots, K\}$.
4      $[H_t] \leftarrow \beta [H_t] + [d(s_k, s)]$.
5      Find $\Delta s$ by solving $\operatorname*{argmin}_{i,j} [H_t]_{ij}$.
6      $s \leftarrow s + \Delta s$.
7      $t \leftarrow t + 1$.
   end
8  $O \leftarrow s$.

Algorithm 1: Segmentation Fusion
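
The loop of Algorithm 1 can be sketched as follows. This is a hedged illustration, not the authors' implementation: the name `fuse` and its parameters are hypothetical, all candidates are assumed to be label lists over `range(num_labels)`, the distance is assumed to be the SDD, the initial segmentation is chosen by Best of K, and the loop simply runs for a fixed number of steps. The move matrix is computed from segment counts, so each entry costs constant time once the contingency counts are known.

```python
import random
from collections import Counter
from math import comb


def sdd(a, b):
    """Symmetric difference distance between two label lists."""
    def pairs(labels):
        return sum(comb(n, 2) for n in Counter(labels).values())
    return pairs(a) + pairs(b) - 2 * pairs(zip(a, b))


def move_matrix(s_k, s, num_labels):
    """N x C matrix of SDD(s_k, s') where s' is s with element i relabelled to j."""
    n_k, n_s, joint = Counter(s_k), Counter(s), Counter(zip(s_k, s))

    def disagreements(i, j):
        # Pairs (i, m), m != i, co-segmented in exactly one of s_k and s'.
        same = 1 if s[i] == j else 0
        return (n_k[s_k[i]] - 1) + n_s[j] - 2 * joint[(s_k[i], j)] + same

    base = sdd(s_k, s)
    return [[base - disagreements(i, s[i]) + disagreements(i, j)
             for j in range(num_labels)] for i in range(len(s))]


def fuse(candidates, num_labels, beta=0.9, steps=200, seed=0):
    """Filtered stochastic best-one-element-move loop (sketch of Algorithm 1)."""
    rng = random.Random(seed)
    n = len(candidates[0])
    # Best of K initialization of the current segmentation s.
    s = list(min(candidates, key=lambda c: sum(sdd(x, c) for x in candidates)))
    h = [[0.0] * num_labels for _ in range(n)]
    order = []
    for _ in range(steps):
        if not order:  # start a new pseudo-random permutation of the candidates
            order = list(range(len(candidates)))
            rng.shuffle(order)
        d = move_matrix(candidates[order.pop()], s, num_labels)
        # Filtered update: scale the previous matrix by beta and add the new distances.
        h = [[beta * h[i][j] + d[i][j] for j in range(num_labels)] for i in range(n)]
        cells = ((i, j) for i in range(n) for j in range(num_labels))
        i, j = min(cells, key=lambda ij: h[ij[0]][ij[1]])
        s[i] = j  # apply the best one element move
    return s


truth = [0] * 6 + [1] * 6 + [2] * 6
cands = [list(truth) for _ in range(3)]
cands[0][0], cands[1][7], cands[2][13] = 1, 2, 0   # one mislabelled pixel per candidate
print(fuse(cands, num_labels=3, beta=0.9, steps=50, seed=1))
```

Fixing the seed makes a run reproducible, which helps when comparing different values of `beta` on the same candidate set.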


II-A. Distance Learning 

In this section, we propose a method, called distance
learning, that employs the training data to measure the distance
between two segmentations obtained at the output of different
segmentation algorithms. The proposed distance learning
method is also flexible for measuring the distance between
two segmentations with different numbers of segments.

We first define the Rand Index (RI), which is used to estimate
the quality of the segments. Given two segmentations $s_i$
and $s_j$, RI is defined as
$RI(s_i, s_j) = 1 - \frac{d(s_i, s_j)}{\binom{N}{2}}$, where
$d(s_i, s_j) = N_{10} + N_{01} = \binom{N}{2} - (N_{00} + N_{11})$,
and $N_{11}$ and $N_{00}$ are the numbers of pairs co-segmented in
both segmentations and in neither of them. However, RI
is not corrected for chance: for instance, the average distance
between two segmentations is not zero, and the distance
depends on the number of pixels [9]. Therefore, we assume
that each segmentation $s_i = \{s_{i,k_i}\}_{k_i=1}^{K_i}$ consists of
a different number of segments $K_i$. We define $N_{k_i}$ as the
number of pixels in the $k_i$-th segment of $s_i$, and $N_{k_i k_j}$ as
the number of pixels in both the $k_i$-th segment of $s_i$ and the
$k_j$-th segment of $s_j$. In addition, we assume that $s_i$ and $s_j$
are randomly drawn with a fixed number of segments, and a fixed
number of pixels in each segment, according to a generalized
hypergeometric distribution [10]. Then, an adjusted version of RI
called Adjusted Rand Index (ARI) [10] is defined as

$$ARI(s_i, s_j) = \frac{\sum_{k_i=1}^{K_i} \sum_{k_j=1}^{K_j} \binom{N_{k_i k_j}}{2} - \theta_{ij}}{\frac{1}{2}(\theta_i + \theta_j) - \theta_{ij}}, \qquad (3)$$

where $\theta_i = \sum_{k_i=1}^{K_i} \binom{N_{k_i}}{2}$,
$\theta_j = \sum_{k_j=1}^{K_j} \binom{N_{k_j}}{2}$ and
$\theta_{ij} = \frac{2 \theta_i \theta_j}{N(N-1)}$.

Note that, if we apply our assumptions for equal segmentation
sizes $K_i = K_j$, $\forall i \neq j$, in ARI, we obtain (1) [7].
Instead, we compute $ARI(s_i, s_j)$ for each different base
segmentation algorithm output $s_i$ with different segment
numbers $K_i$, and $d(s_i, s_j)$ is computed from $ARI(s_i, s_j)$ such
that $d(s_i, s_j) = 1 - ARI(s_i, s_j)$ [7]. We call this method
Distance Learning for BOEM (DL), in which we learn $d(s_i, s_j)$
by computing $ARI(s_i, s_j)$ using the data.
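
The ARI-based distance can be computed directly from the contingency counts. The sketch below assumes label lists of equal length N >= 2; the names `adjusted_rand_index` and `dl_distance` are hypothetical, and returning 1 in the degenerate case where the denominator vanishes is a convention chosen here, not something stated in the text.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(a, b):
    """ARI of two label lists, which may use different numbers of segments."""
    n = len(a)
    both = sum(comb(v, 2) for v in Counter(zip(a, b)).values())
    theta_a = sum(comb(v, 2) for v in Counter(a).values())
    theta_b = sum(comb(v, 2) for v in Counter(b).values())
    expected = theta_a * theta_b / comb(n, 2)   # equals 2 * theta_a * theta_b / (N (N - 1))
    maximum = (theta_a + theta_b) / 2
    if maximum == expected:                     # degenerate case (assumed convention)
        return 1.0
    return (both - expected) / (maximum - expected)


def dl_distance(a, b):
    """Distance used in DL: one minus the ARI."""
    return 1.0 - adjusted_rand_index(a, b)


three_segments = [0, 0, 1, 1, 2, 2]
two_segments = [0, 0, 0, 1, 1, 1]
print(dl_distance(three_segments, two_segments))
```

Because the index is built from segment sizes only, the two inputs do not need to share a label alphabet or a segment count.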

An important assumption that is made in the derivation of 
ARI [11] is that the number of pixels in each segment is the 
same. However, this assumption may fail in the segmentation 
of images that contain complex targets, such as airports or 
harbors. 

In order to relax this assumption, we employ a normalization
method for quasi-distance functions, introduced by Luo
et al. [12] as

$$nd(s_i, s_j) = \frac{d(s_i, s_j) - d_{min}(s_i, s_j)}{d_{max}(s_i, s_j) - d_{min}(s_i, s_j)}, \qquad (4)$$

where $d_{min}(s_i, s_j)$ and $d_{max}(s_i, s_j)$ are the minimal and
maximal values of $d(s_i, s_j)$. Luo et al. [12] state that the
exact computation of $d_{max}(s_i, s_j)$ for any segmentation
distribution is not known, and they introduce several
approximations. In the experiments, we employ (4) as the method
called Quasi-distance Learning (QD). For the details of the
algorithms to solve (4), please refer to [12].
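
As a rough illustration of the normalization in (4), the sketch below rescales a distance with given bounds. The approximations of Luo et al. for $d_{max}$ are not reproduced here; instead, as a stand-in assumption, the bounds are taken as the smallest and largest distances observed on training pairs. The names `empirical_bounds` and `normalized_distance`, and the clipping to [0, 1], are hypothetical choices made for this example.

```python
def empirical_bounds(training_distances):
    """Assumed stand-in for d_min and d_max: extremes observed on training pairs."""
    values = list(training_distances)
    return min(values), max(values)


def normalized_distance(d, d_min, d_max):
    """Rescale d as (d - d_min) / (d_max - d_min)."""
    if d_max <= d_min:
        raise ValueError("d_max must be larger than d_min")
    nd = (d - d_min) / (d_max - d_min)
    # Clipping is an assumption: test distances may fall outside the training bounds.
    return min(1.0, max(0.0, nd))


d_min, d_max = empirical_bounds([3, 7, 5])
print(normalized_distance(5, d_min, d_max))
```

When the training and test images differ in their statistics, the clipped values indicate that the learned bounds no longer cover the test distances.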

An important difference between (3) and (4) is that, in (4), we
consider the minimal and maximal values of the distances
between the pairwise segmentations $(s_i, s_j)$ as the
normalization factors in order to compute the distances between
$s_i$ and $s_j$. On the other hand, (3) considers the expected
values of the distances between all of the segmentations in
the computations.

If training data is available, then $d_{min}(s_i, s_j)$ and
$d_{max}(s_i, s_j)$ can be computed using the training data and
applied to the test data. However, one must ensure that the
statistical properties of the training and test data are equivalent
in order to employ the learning methods. In the experiments, we
observe that this equivalence requirement may not be satisfied in
remote sensing datasets because of the variability
of the images in space and time.


II-B. Estimating Number of Clusters and Parameter $\beta$
for BOEM

One of the crucial problems of image segmentation is
to estimate the number of clusters, $C$, that form different
segments in the image. This problem remains important for the
segmentation of remotely sensed images even if the images
are labeled using expert knowledge.

In order to estimate $C$ in the base segmentation algorithms,
several clustering validity indices can be employed
[13]. In this section, we introduce a new method to estimate
$C$ for segmentation fusion. For this purpose, we consider
a segmentation index (SI) for BOEM as
$SI(c) = \sum_{i<j} ARI(s_i, s_j)$, where $\{s_i\}_{i=1}^{K}$ is the
set of $K$ segmentations where each segmentation $s_i$ contains
segments with $c$ different labels [14]. Then, we solve the
following optimization problem,

$$\hat{C} = \operatorname*{argmin}_{c = 2, \ldots, C_{max}} SI(c), \qquad (5)$$

where $C_{max}$ is the maximum value of $c$ provided by the
user. Vinh and Epps [14] compared Normalized Mutual
Information and ARI for the estimation of segment number on
several datasets. Since both of the indices agree on the
segment number in various experiments, we employ ARI in
our experiments for estimating $c$.

A similar approach is employed to estimate the parameter
$\beta$. Given a set of $\beta$ values $S = \{\beta_b\}_{b=1}^{B}$, we
introduce a beta index (BI) as
$BI(\beta_b) = \sum_{i=1}^{K} ARI(s_i, O(\beta_b))$, where
$O(\beta_b)$ is the output segmentation of the Segmentation
Fusion Algorithm implemented using $\beta_b$. Then the optimal
$\beta$ is computed by solving the following optimization problem:

$$\hat{\beta} = \operatorname*{argmin}_{\beta_b} BI(\beta_b). \qquad (6)$$
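
The two selection rules (5) and (6) share the same structure: compute an ARI-based index for each candidate value and take the minimizer. The sketch below follows the argmin as stated in the text and assumes that BI sums the ARI between each candidate segmentation and the fused output; the names `ari`, `segmentation_index`, `beta_index` and `select` are hypothetical, and the example data are made up for illustration.

```python
from collections import Counter
from itertools import combinations
from math import comb


def ari(a, b):
    """Adjusted Rand Index of two label lists of equal length N >= 2."""
    def pairs(labels):
        return sum(comb(v, 2) for v in Counter(labels).values())
    theta_a, theta_b = pairs(a), pairs(b)
    expected = theta_a * theta_b / comb(len(a), 2)
    maximum = (theta_a + theta_b) / 2
    if maximum == expected:      # degenerate case, treated as perfect agreement
        return 1.0
    return (pairs(zip(a, b)) - expected) / (maximum - expected)


def segmentation_index(segmentations):
    """SI(c): sum of the ARI over all pairs i < j of segmentations with c labels."""
    return sum(ari(a, b) for a, b in combinations(segmentations, 2))


def beta_index(segmentations, fused):
    """BI(beta_b): sum of the ARI between each candidate and the fused output (assumed form)."""
    return sum(ari(s, fused) for s in segmentations)


def select(index_by_value):
    """Return the candidate value with the smallest index, as in (5) and (6)."""
    return min(index_by_value, key=index_by_value.get)


by_c = {
    2: [[0, 0, 0, 1, 1, 1]] * 3,
    3: [[0, 0, 1, 1, 2, 2], [0, 0, 1, 1, 2, 2], [0, 1, 1, 2, 2, 0]],
}
si = {c: segmentation_index(segs) for c, segs in by_c.items()}
print(si, select(si))
```

For (6), the same `select` helper can be applied to a dictionary that maps each tested `beta` to `beta_index(candidates, fused_output)`.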

III. EXPERIMENTS 

We use two indices to measure the (dis)similarity between
an output image $O$ and the ground truth of the images as
performance criteria: i) Rand Index (RI), which takes values in
$[0, 1]$, and ii) Adjusted Rand Index (ARI) [9], which is bounded
above by 1 and can take small negative values. When the
output image $O$ and the ground truth image are identical, the
ARI and the RI are equal to 1. Moreover, the ARI equals
0 when the RI equals its expected value.

Table I: Performance of the Algorithms for Thematic Mapper
Image

       Average Base   Algorithm 1   DL      QD
RI     0.703          0.704         0.710   0.714
ARI    0.159          0.160         0.184   0.174

In the first set of experiments, we employ the proposed
segmentation fusion algorithms on the 7-band Thematic Mapper
Image provided by MultiSpec [15]. We split the
image with size 169 x 169 into training and test images: i)
a subset of the pixels with coordinates $x = (1 : 169)$ and
$y = (1 : 90)$ is taken as the training image and ii) a subset of
the pixels with coordinates $x = (1 : 169)$ and $y = (91 : 142)$
is taken as the test image. In the images, there are $C = 6$
clusters corresponding to different segments.

We first implement k-means on $J = 7$ different bands in
order to perform multi-modal data fusion. The termination
time of Filtered Stochastic BOEM is set to $T = 1000$.
Assuming that we do not know the number of clusters $C$
in the image, we employ (5) using the training data in order
to find the optimal $C$ for $c = 2, 3, 4, 5, 6, 7, 8, 9, 10$. Then,
we find $\hat{C} = 6$ with ARI = 0.2648. We employ (6) for
$S = \{0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.99\}$ and find
$\hat{\beta} = 0.9$ with ARI = 0.2648. The results of the experiments
on the test data of the Thematic Mapper Image are given in Table
I. In the Average Base column, the average performance
values of the k-means algorithms are given. The performance
values of the segmentation fusion algorithm are given in the
column labeled Algorithm 1. We observe that the performance
values of Algorithm 1 are similar to the arithmetic
average of the performance values of the k-means algorithms.
The performance values of the Distance Learning and
Quasi-distance Learning algorithms are given in the DL and QD
columns, respectively. Since the distance functions for Algorithm 1
are computed using the segmentation-wise ARI values in DL and QD,
we observe an increase in the ARI values of DL
and QD compared to Algorithm 1.

In the second set of experiments, we employ the k-means,
Graph Cut and Mean Shift algorithms on the 7-band training
and test images. Now, the image segmentation problem is
considered as a pixel clustering problem in a 7-dimensional
space. We find $\hat{C} = 6$ and $\hat{\beta} = 0.9$ with ARI = 0.267
using the training data. The results on the test data are
given in Table II. The performance values of Algorithm
1 are closer to the performance values of the Mean Shift
algorithm, since the output image of Algorithm 1 is closer
to the output segmentation of the Mean Shift algorithm.
We observe that the ARI values of DL are greater than the
values of QD, since DL computes the distance functions
by computing the ARI values between the segmentations.
However, the RI values of QD are greater than the values
of DL, since QD calibrates distance functions considering
the distance measure of the RI.

Table II: Experiments using k-means, Graph Cut and Mean
Shift Algorithms on 7-band Images

       k-means   Graph Cut   Mean Shift
RI     0.715     0.717       0.714
ARI    0.125     0.132       0.176

       Algorithm 1   DL      QD
RI     0.714         0.710   0.724
ARI    0.176         0.180   0.178


In the third set of experiments, we employ the k-means
algorithm on each band of the 12-band Moderate Dimension Image:
June 1966 aircraft scanner Flightline C1 (Portion of Southern
Tippecanoe County, Indiana) [15]. The size of the image is
949 x 220, and there are 11 clusters in the ground truth of the
image [15]. We randomly select 104390 pixels for training
and the remaining 104390 pixels for testing. We find $\hat{C} = 11$
and $\hat{\beta} = 0.9$ with ARI = 0.004 using the training data. The
results on the test data are given in Table III and Table IV.
We observe that the performance values for Algorithm 1 are
smaller than the average performance values of the base
segmentation outputs. Since the distance functions are computed for
each segmentation pair, we achieve better performance for the
distance learning algorithms (DL and QD).

Table III: Performance of k-means Algorithms for Moderate
Dimension Image

       Ch1     Ch2     Ch3     Ch4     Ch5     Ch6
RI     0.537   0.531   0.528   0.532   0.532   0.523
ARI    0.014   0.006   0.009   0.009   0.006   -0.003

       Ch7     Ch8     Ch9     Ch10    Ch11    Ch12
RI     0.529   0.531   0.534   0.527   0.540   0.540
ARI    0.000   0.008   0.015   -0.003  0.023   0.018

Table IV: Performance of the Algorithms for Moderate
Dimension Image

       Average Base   Algorithm 1   DL      QD
RI     0.532          0.530         0.533   0.530
ARI    0.009          0.007         0.011   0.011

IV. CONCLUSION

In this study, we introduce a new approach for the fusion
of the segmentation outputs of several segmentation
algorithms to achieve a consensus segmentation. Therefore,
the output of the segmentation fusion algorithm can be interpreted
as the image representing the mutual information on a set
of segmentation outputs obtained from various segmentation
algorithms.

We construct the candidate segmentation set by using the
k-means, Mean Shift and Graph Cuts methods applied to
the hyperspectral images. The parameter optimization of the
segmentation is embedded into the Filtered Stochastic BOEM
method. Additionally, the distance metrics are learned using
the training data in order to enhance the segmentation
performance without preselecting parameters or evaluating
the outputs for specific targets. The performance of the
suggested segmentation fusion algorithm demonstrates its
efficacy in reaching a compromise between over-segmented
and under-segmented results.